Questions Are All You Need to Train a Dense Passage Retriever

Abstract

We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g., questions and potential answer passages). It uses a passage-retrieval autoencoding scheme, in which (1) an input question is used to retrieve a set of evidence passages, and (2) the passages are then used to compute the probability of reconstructing the original question. Training for retrieval based on question reconstruction enables effective unsupervised learning of both passage and question encoders, which can later be incorporated into complete Open QA systems without any further finetuning. Extensive experiments demonstrate that ART obtains state-of-the-art results on multiple QA retrieval benchmarks with only generic initialization from a pre-trained language model, removing the need for labeled data and task-specific losses. Our code and model checkpoints are available at: https://github.com/DevSinghSachan/art.
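The two-step scheme described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names, the dot-product scoring, and the choice of a KL-divergence loss between the retriever's distribution and a distribution derived from question-reconstruction log-likelihoods are all assumptions made for illustration, and the reconstruction log-likelihoods are passed in as plain numbers instead of being computed by a language model.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def retrieve_top_k(question_vec, passage_vecs, k):
    # Step (1): score every passage against the question (assumed dot product)
    # and keep the k highest-scoring passages.
    scored = [(dot(question_vec, p), i) for i, p in enumerate(passage_vecs)]
    scored.sort(reverse=True)
    top = scored[:k]
    return [i for _, i in top], [s for s, _ in top]

def reconstruction_loss(retriever_scores, recon_log_likelihoods):
    # Step (2): recon_log_likelihoods[i] stands in for log p(question | passage_i),
    # which a pre-trained language model would supply (hypothetical input here).
    # Assumed loss: KL(target || predicted) over the retrieved passages.
    target = softmax(recon_log_likelihoods)
    predicted = softmax(retriever_scores)
    return sum(
        t * (math.log(t) - math.log(p))
        for t, p in zip(target, predicted)
        if t > 0
    )

# Toy usage: three passages, retrieve two, then compare against made-up
# reconstruction log-likelihoods for the retrieved passages.
question = [1.0, 0.0]
passages = [[0.1, 0.9], [0.9, 0.1], [0.5, 0.5]]
indices, scores = retrieve_top_k(question, passages, k=2)
loss = reconstruction_loss(scores, [-5.0, -1.0])
```

In this sketch the loss is zero when the retriever already ranks passages in proportion to how well they explain the question, and it grows as the two distributions disagree, which is the signal a retriever could be trained on without labeled question-passage pairs.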

Similar resources

Linear Regions Are All You Need

The type-and-effects system of the Tofte-Talpin region calculus makes it possible to safely reclaim objects without a garbage collector. However, it requires that regions have last-in-first-out (LIFO) lifetimes following the block structure of the language. We introduce λ, a core calculus that is powerful enough to encode Tofte-Talpin-like languages, and that eliminates the LIFO restriction. Th...

All You Need Is Compassion

The paper presents a new deductive rule for verifying response properties under the assumption of compassion (strong fairness) requirements. It improves on previous rules in that the premises of the new rule are all first order. We prove that the rule is sound, and present a constructive completeness proof for the case of finite-state systems. For the general case, we present a sketch of a rela...

CNN Is All You Need

CNNs have been successfully used in audio, image and text classification, analysis and generation [12,17,18], whereas the RNNs with LSTM cells [5,6] have been widely adopted for solving sequence transduction problems such as language modeling and machine translation [19,3,5]. The RNN models typically align the element positions of the input and output sequences to steps in computation time for ...

All You Need Is Mentorship

I find it humbling to confess that most of the truly original ideas that have driven my research group’s agenda over four decades of time have come, not from my own brain, but instead from the minds of my trainees, both graduate students and post-docs. This on its own might explain why I, rather selfishly, have given them long leashes, allowing them to strike out on their own and craft their ow...

Domain-Specific Mashups: From All to All You Need

In recent years, alongside the proliferation of Web 2.0, we have witnessed drastic growth of the mashup market. An increasing number of different mashup solutions and platforms have emerged, some focusing on data integration (à la Yahoo! Pipes), others on user interface (UI) integration, and some trying to integrate both UI and data. Most of the proposed solutions have a common characteristic: they aim at provid...

Journal

Journal title: Transactions of the Association for Computational Linguistics

Year: 2023

ISSN: 2307-387X

DOI: https://doi.org/10.1162/tacl_a_00564